NSF PAR Search | NSF Public Access Repository


Mixed Signals: A Diverse Point Cloud Dataset for Heterogeneous LiDAR V2X Collaboration

Luo, Katie Z; Dao, Minh-Quan; Liu, Zhenzhen; Campbell, Mark; Chao, Wei-Lun; Weinberger, Kilian Q; Malis, Ezio; Fremont, Vincent; Hariharan, Bharath; Shan, Mao; et al (October 2025, IEEE)

Vehicle-to-everything (V2X) collaborative perception has emerged as a promising solution to address the limitations of single-vehicle perception systems. However, existing V2X datasets are limited in scope, diversity, and quality. To address these gaps, we present Mixed Signals, a comprehensive V2X dataset featuring 45.1k point clouds and 240.6k bounding boxes collected from three connected autonomous vehicles (CAVs) equipped with two different configurations of LiDAR sensors, plus a roadside unit with dual LiDARs. Our dataset provides point clouds and bounding box annotations across 10 classes, ensuring reliable data for perception training. We provide detailed statistical analysis on the quality of our dataset and extensively benchmark existing V2X methods on it. Mixed Signals is ready-to-use, with precise alignment and consistent annotations across time and viewpoints. We hope our work advances research in the emerging, impactful field of V2X perception. Dataset details at https://mixedsignalsdataset.cs.cornell.edu/.
Full Text Available
DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery

Mall, Utkarsh; Phoo, Cheng Perng; Chiquier, Mia; Hariharan, Bharath; Bala, Kavita; Vondrick, Carl (June 2025, CVPR)

Full Text Available
Scale-aware Recognition in Satellite Images under Resource Constraints

Revankar, Shreelekha; Phoo, Cheng Perng; Mall, Utkarsh; Hariharan, Bharath; Bala, Kavita (April 2025, ICLR)

Full Text Available
MOD-UV: Learning Mobile Object Detectors from Unlabeled Videos

Sun, Yihong; Hariharan, Bharath (December 2024, European Conference on Computer Vision)

Embodied agents must detect and localize objects of interest, e.g. traffic participants for self-driving cars. Supervision in the form of bounding boxes for this task is extremely expensive. As such, prior work has looked at unsupervised instance detection and segmentation, but in the absence of annotated boxes, it is unclear how pixels must be grouped into objects and which objects are of interest. This results in over-/under-segmentation and irrelevant objects. Inspired by human visual system and practical applications, we posit that the key missing cue for unsupervised detection is motion: objects of interest are typically mobile objects that frequently move and their motions can specify separate instances. In this paper, we propose MOD-UV, a Mobile Object Detector learned from Unlabeled Videos only. We begin with instance pseudo-labels derived from motion segmentation, but introduce a novel training paradigm to progressively discover small objects and static-but-mobile objects that are missed by motion segmentation. As a result, though only learned from unlabeled videos, MOD-UV can detect and segment mobile objects from a single static image. Empirically, we achieve state-of-the-art performance in unsupervised mobile object detection on Waymo Open, nuScenes, and KITTI Datasets without using any external data or supervised models. Code is available at github.com/YihongSun/MOD-UV.
Full Text Available
ObjectCarver: Semi-automatic segmentation, reconstruction and separation of 3D objects

Hassena, Gemmechu; Moon, Jonathan; Fujii, Ryan; Yuen, Andrew; Snavely, Noah; Marschner, Steve; Hariharan, Bharath (March 2025, IEEE)

Full Text Available
MOD-UV: Learning Mobile Object Detectors from Unlabeled Videos

Sun, Yihong; Hariharan, Bharath (October 2024, European Conference on Computer Vision)

Full Text Available
MOD-UV: Learning Mobile Object Detectors from Unlabeled Videos

Sun, Yihong; Hariharan, Bharath (October 2024, ECCV 2024)
Leonardis, A; Ricci, E; Roth, S; Russakovsky, O; Sattler, T; Varol, G (Ed.)
Embodied agents must detect and localize objects of interest, e.g. traffic participants for self-driving cars. Supervision in the form of bounding boxes for this task is extremely expensive. As such, prior work has looked at unsupervised instance detection and segmentation, but in the absence of annotated boxes, it is unclear how pixels must be grouped into objects and which objects are of interest. This results in over-/under-segmentation and irrelevant objects. Inspired by human visual system and practical applications, we posit that the key missing cue for unsupervised detection is motion: objects of interest are typically mobile objects that frequently move and their motions can specify separate instances. In this paper, we propose MOD-UV, a Mobile Object Detector learned from Unlabeled Videos only. We begin with instance pseudo-labels derived from motion segmentation, but introduce a novel training paradigm to progressively discover small objects and static-but-mobile objects that are missed by motion segmentation. As a result, though only learned from unlabeled videos, MOD-UV can detect and segment mobile objects from a single static image. Empirically, we achieve state-of-the-art performance in unsupervised mobile object detection on Waymo Open, nuScenes, and KITTI Datasets without using any external data or supervised models. Code is available at github.com/YihongSun/MOD-UV.
Full Text Available
MOD-UV: Learning Mobile Object Detectors from Unlabeled Videos

Sun, Yihong; Hariharan, Bharath (October 2024, European Conference on Computer Vision, 2024)

Full Text Available
MOD-UV: Learning Mobile Object Detectors from Unlabeled Videos

Sun, Yihong; Hariharan, Bharath (September 2024, European Conference on Computer Vision)

Full Text Available
Learning 3D Perception from Others' Predictions

Yoo, Jinsu; Feng, Zhenyang; Pan, Tai-Yu; Sun, Yihong; Phoo, Cheng Perng; Chen, Xiangyu; Campbell, Mark; Weinberger, Kilian Q; Hariharan, Bharath; Chao, Wei-Lun (April 2025, International Conference on Learning Representations)

Accurate 3D object detection in real-world environments requires a huge amount of annotated data with high quality. Acquiring such data is tedious and expensive, and often needs repeated effort when a new sensor is adopted or when the detector is deployed in a new environment. We investigate a new scenario to construct 3D object detectors: learning from the predictions of a nearby unit that is equipped with an accurate detector. For example, when a self-driving car enters a new area, it may learn from other traffic participants whose detectors have been optimized for that area. This setting is label-efficient, sensor-agnostic, and communication-efficient: nearby units only need to share the predictions with the ego agent (e.g., car). Naively using the received predictions as ground-truths to train the detector for the ego car, however, leads to inferior performance. We systematically study the problem and identify viewpoint mismatches and mislocalization (due to synchronization and GPS errors) as the main causes, which unavoidably result in false positives, false negatives, and inaccurate pseudo labels. We propose a distance-based curriculum, first learning from closer units with similar viewpoints and subsequently improving the quality of other units' predictions via self-training. We further demonstrate that an effective pseudo label refinement module can be trained with a handful of annotated data, largely reducing the data quantity necessary to train an object detector. We validate our approach on the recently released real-world collaborative driving dataset, using reference cars' predictions as pseudo labels for the ego car. Extensive experiments including several scenarios (e.g., different sensors, detectors, and domains) demonstrate the effectiveness of our approach toward label-efficient learning of 3D perception from other units' predictions.
Full Text Available
